Abstract

We present a deterministic exploration mechanism for sponsored search auctions, which enables the 
auctioneer to learn the relevance scores (Click-Through-Rates) of advertisers, and allows advertisers to 
estimate the true value of clicks generated at the auction site. This exploratory mechanism deviates only 
minimally from the mechanism being currently used by Google and Yahoo! in the sense that it retains 
the same pricing rule, similar ranking scheme, as well as, similar mathematical structure of payoffs. In 
particular, the estimations of the relevance scores and true-values are achieved by providing a chance 
to lower ranked advertisers to obtain better slots. This allows the search engine (the auctioneer) to 
potentially test a new pool of advertisers, and correspondingly, enables new advertisers to estimate the 
value of clicks/leads generated via the auction. Both these quantities are unknown a priori, and their 
knowledge is necessary for the auction to operate efficiently. We show that such an exploration policy 
can be incorporated without any significant loss in revenue for the auctioneer. We compare the revenue of
the new mechanism to that of the standard mechanism (i.e., without exploration) at their corresponding
symmetric Nash equilibria (SNE) and compute the cost of uncertainty, which is defined as the relative
loss in expected revenue per impression. We also bound the loss in efficiency (i.e., social welfare), as well
as in user experience, due to exploration, under the same solution concept (i.e., SNE). Thus the proposed
exploration mechanism learns the relevance scores while incorporating the incentive constraints from the 
advertisers who are selfish and are trying to maximize their own profits, and therefore, the exploration 
is essentially achieved via mechanism design. We also discuss variations of the new mechanism such as 
truthful implementations. 



1 Introduction 

1.1 Preliminary Background 

With the growing popularity of web search for obtaining information, sponsored search advertising, where 
advertisers pay to appear alongside the algorithmic/organic search results, has become a significant business 
model today and is largely responsible for the success of Internet Search giants such as Google and Yahoo!. 
In this form of advertising, the Search Engine allocates the advertising space using an auction. Advertisers 
bid upon specific keywords. When a user searches for a keyword, the search engine (the auctioneer) allocates 
the advertising space to the bidding merchants based on their bid values and quality scores/factors, and their 
ads are listed accordingly. Usually, the sponsored search results appear in a separate section of the page 
designated as "sponsored links" above/below or to the right of the organic/algorithmic results and have
similar display format as the algorithmic results. Each position in such a list of sponsored links is called 
a slot. Whenever a user clicks on an ad, the corresponding advertiser pays an amount specified by the 
auctioneer. Generally, users are more likely to click on a higher ranked slot, therefore advertisers prefer to 
be in higher ranked slots and compete for them. 

From the above description, we can note that after merchants have bid for a specific keyword, when 
that keyword is queried, the auctioneer follows two steps. First, she allocates the slots to the advertisers 
depending on their bid values. Normally, this allocation is done using some ranking function. Secondly, she 
decides, through some pricing scheme, how much a merchant should be charged if the user clicks on her ad 
and in general this depends on which slot she got, on her bid and that of others. In the auction formats for 
sponsored search, there are two ranking functions, namely rank by bid (RBB) and rank by revenue (RBR), and
there are two pricing schemes, namely generalized first pricing (GFP) and generalized second pricing (GSP),
which have been used widely. In RBB, bidders are ranked according to their bid values. The advertiser with
the highest bid gets the first slot, that with the second highest bid gets the second slot, and so on. In RBR, the
bidders are ranked according to the product of their bid value and quality score. The quality score represents 
the merchant's relevance to the specific keyword, which can basically be interpreted as the possibility that 
her ad will be viewed if given a slot irrespective of what slot position she is given. In GFP, the bidders are 
essentially charged the amount they bid and in GSP they are charged an amount which is enough to ensure 
their current slot position. For example, under RBB allocation, GSP charges a bidder an amount equal to 
the bid value of the bidder just below her. 

Formal analysis of such sponsored search advertising model has been done extensively in recent years, 
from algorithmic as well as from game theoretic perspective [5, 17, 8, 1, 11, 9, 10]. In a formal setup, there
are K slots to be allocated among N (> K) bidders. A bidder i has a true valuation $v_i$ (known only to the
bidder i) for the specific keyword and she bids $b_i$. The expected click-through rate of an ad put by bidder i
when allocated slot j has the form $CTR_{ij} = \gamma_j e_i$, i.e., separable into a position effect and an advertiser effect.
The $\gamma_j$'s can be interpreted as the probability that an ad will be noticed when put in slot j, and it is assumed
that $\gamma_1 > \gamma_2 > \cdots > \gamma_K > 0$. $e_i$ can be interpreted as the probability that an ad put by bidder i will be
clicked on if noticed and is referred to as the relevance of bidder i. This is the quality score used in the RBR
allocation rule mentioned earlier. The payoff/utility of bidder i when given slot j at a price of p is given by
$e_i \gamma_j (v_i - p)$, and bidders are assumed to be rational agents trying to maximize their payoffs. Further, in typical
slot auctions, bidders can adjust their bids up or down at any time and therefore the auction can be viewed 
as a continuous-time process in which bidders learn each other's bids. If the process stabilizes, the result 
can then be modeled as solution of the static one-shot game of complete information, since each bidder will 
be playing a best-response to others' bids. 

As of now, Google as well as Yahoo! use schemes that can be accurately modeled as RBR with GSP. The 
bidders are ranked according to $e_i b_i$ and the slots are allocated as per these ranks. For simplicity of notation,
assume that the ith bidder is the one allocated slot i according to this ranking rule; then i is charged an
amount equal to $\frac{e_{i+1} b_{i+1}}{e_i}$. The revenue and incentive properties of this model have been thoroughly analyzed
in the above-mentioned articles.
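
As an illustration of the RBR ranking with GSP pricing described above, the following is a minimal sketch, not the search engines' actual implementation. The function name `gsp_rank_by_revenue` is hypothetical, relevance scores are assumed to be known exactly, ties are broken by bidder index, and a bidder with nobody ranked below her is assumed to pay zero (no reserve price).

```python
from typing import List, Tuple


def gsp_rank_by_revenue(bids: List[float], relevance: List[float],
                        num_slots: int) -> List[Tuple[int, float]]:
    """Return (bidder index, price per click) for each filled slot, top slot first."""
    # Rank bidders by relevance * bid, highest first (ties broken by index).
    order = sorted(range(len(bids)), key=lambda i: (-relevance[i] * bids[i], i))
    result = []
    for pos in range(min(num_slots, len(order))):
        i = order[pos]
        if pos + 1 < len(order):
            nxt = order[pos + 1]
            # GSP: the smallest per-click price that keeps bidder i ahead of the next bidder.
            price = relevance[nxt] * bids[nxt] / relevance[i]
        else:
            price = 0.0  # assumption: no reserve price when nobody ranks below
        result.append((i, price))
    return result


# Three bidders, two slots: scores are 0.4, 0.6 and 0.2, so bidder 1 ranks first.
allocation = gsp_rank_by_revenue([4.0, 3.0, 2.0], [0.1, 0.2, 0.1], num_slots=2)
```

Note that the bidder with the highest bid (index 0) does not win the top slot here, because the ranking weighs bids by relevance.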

1.2 The need for exploration 

In the standard model described above, it is implicitly assumed that the auctioneer knows the relevance 
$e_i$'s, but in practice this is not entirely true, as new advertisers also join the game and the estimates
for the advertisers getting lower ranked slots are also generally poor, as they hardly get any clicks. Further,
it is also assumed that the bidders know their true valuations accurately and bid accordingly, and high 
budget advertisers and low budget advertisers (e.g., mom-and-pop businesses) have similar awareness and 
risk levels. In reality, an advertiser might not know her true value and what to bid, and in particular a low 
budget advertiser might be loss-averse [16] and may not be able to bid high enough to explore, due to the
potential risks involved. Furthermore, in the sponsored search auctions, the value is derived from the clicks 
themselves (i.e. rate of conversion or purchase given a click), and therefore, unless she actually obtains a 
slot and receives user clicks, there is essentially no means for her to estimate her true value for the keyword. 
Certainly, a model that automatically allows one to estimate these key parameters (i.e. CTRs and true values) 
is desirable. 

1.3 Results in this paper and related work 

Our goal in this paper is to study the problem of learning relevance scores and valuations in a mechanism 
design framework while deviating only minimally from the mechanism being currently used by Google and 
Yahoo!. The problem of learning CTRs has also been addressed in [12, 6, 13, 7]. Our result is different from
[12] in that the latter disregards the advertisers' incentives. The result in [6] does consider the advertisers'
incentive; however, its goal is not to study exploration in the mechanisms currently being used by search 
engines, but to implement a truthful mechanism that also learns the CTRs, and therefore, it had to deviate 
from the current pricing scheme. Our mechanism can also be easily adapted for truthful implementation via 
a new pricing scheme, and in fact, all the revenue analysis remains the same as we shall discuss later in the 
paper. The study in [13] is empirical, and that in [7] is not exploration based, restricts itself to a single slot
case, and does not consider advertisers' incentives.

We recently learned about an independent study by Wortman et al. [18] along lines similar to ours, i.e.,
designing mechanisms for exploration that deviate minimally from the standard model without exploration 
and then comparing their respective incentive properties. Our mechanisms for exploration are, however, 
quite different and they originated from a different set of approaches. Indeed, a preliminary draft that 
includes all the main results presented in the current paper (although motivated a little differently) was 
posted in early July 2007 [14], well before the work in [18] was made publicly available. As discussed in
greater detail in the following, here are some of the distinctive features of our independent work: (i) Our
exploration mechanism is a deterministic one, unlike the randomized one analyzed in [18]; (ii) We explicitly
discuss how advertisers could estimate their true valuations under our exploration based mechanism. As 
argued before, true valuation is often unknown a priori, and has to be accurately estimated; (iii) Besides 
studying the loss in revenue due to exploration, we also explicitly discuss the loss in efficiency, as well 
as, loss in user experience due to exploration; (iv) The tools and approaches used in the analysis of our 
mechanism are very different from those presented in [18], and they highlight several interesting features
of mechanism design and incentive analysis. For example, we show that the mathematical structure of
payoffs in our exploration mechanism is the same as in the standard mechanism without exploration, which
allows us to utilize results from the latter. Thus, our approach represents an instance where reduction among
mechanism design problems is being successfully used as an analytical tool.

Moreover, as we discuss later in the paper, the problem of designing a family of optimal exploratory
mechanisms, which for example would provide the most information while minimizing expected loss in
revenue, is far from being solved. The work in [18] and in this paper provide just two instances of mechanism
design which do provably well, but more work analyzing different aspects of exploratory mechanisms is
necessary in this emerging field. Thus, to the best of our knowledge, we are one of the first groups to
formally study the problem of estimating relevance and valuations from incentive as well as learning theory
perspectives without deviating much from the mechanism currently in place.

In the following we summarize our results as well as the organization of the rest of the paper: 

1. We design a deterministic exploration mechanism to learn the relevance scores by deviating minimally
from the mechanism being currently used by Google and Yahoo! in the sense that it retains the same
pricing rule, as well as a similar ranking scheme. In particular, the estimation of the relevance scores is
achieved by providing a chance to lower ranked advertisers to obtain better slots. Qualitatively, some
top slots are designated for exploration purposes and each of the advertisers whose relevance is to be
estimated is given an equal chance to appear in those slots. In Section 2, we formally introduce this
exploration mechanism, which we call Exp-GSP; the standard RBR with GSP mechanism without
exploration is referred to as GSP.

2. In Section 3, we study the incentive properties of the Exp-GSP mechanism by modeling it as a one-shot
static game of complete information, as in the case of GSP [5, 17]. We show that the mathematical
structure of the payoffs of the bidders in Exp-GSP is the same as in GSP, and therefore all the 
incentive analysis from GSP can be adopted for Exp-GSP. This further corroborates our claim that 
our exploration mechanism deviates only minimally from GSP and indeed our approach can also be 
understood as reduction among mechanism design problems. Furthermore, another interesting feature 
of our exploration mechanism is that the attention or the quality of service (in terms of position based 
CTRs i.e. probability of being noticed) provided to advertisers is still in the same relative order as in 
standard mechanism without exploration. 

3. It is clear that any exploration mechanism will incur some cost in terms of revenue compared to 
the case when we do not need an exploration. We formalize this cost via cost of uncertainty which 
is defined as the relative loss in expected revenue of the auctioneer per impression. To this end, 
we compare the revenue of the Exp-GSP to that of GSP at their corresponding symmetric Nash 
equilibria (SNE) and bound the cost of uncertainty. Our analysis confirms the intuition that a higher
cost is incurred for better exploration, i.e., there is a tradeoff between quality of exploration/estimation
and the revenue. Nevertheless, the associated parameters can be tuned to ensure a suitable balance
between these two conflicting needs: minimizing the loss in revenue while allowing for sufficient
exploration to be able to estimate parameters such as the relevance scores. These revenue properties
are studied in Section 4.

4. Section 5 discusses the loss in efficiency in Exp-GSP compared to GSP. As in the case of revenue,
there is a tradeoff between efficiency (i.e., social welfare) and the quality of exploration/estimation.
Additionally, our analysis suggests that the closer we are to the optimal efficiency (i.e., the case
when the auctioneer knows the true values of the relevance scores and the advertisers know their
valuations), the less we lose in efficiency due to exploration. This means that over successive phases
of the exploration the loss in efficiency shrinks. Similar observations can also be obtained for
user experience, which can be defined as the total clickability of all ads.

5. In Section 6, we discuss how our exploration mechanism, i.e., Exp-GSP, can be used to estimate
relevance scores and valuations, as well as the quality of such estimation, using Chernoff bound
arguments.

6. In Sections 2 through 6, we restrict ourselves to the standard assumption in the literature that
the CTRs are separable. In Section 7, we remove this assumption and study some other variations of
Exp-GSP. In particular, by imposing a new pricing rule we can turn our exploration mechanism into a
truthful one. Moreover, an upper bound on the cost of uncertainty similar to that for
Exp-GSP with separable CTRs is established.

2 An exploration based Generalized Second Price mechanism 

In this section, we formally introduce our exploration mechanism. First we set up some notation and definitions.

Notation: There are N advertisers/bidders bidding for a specific keyword and this keyword appears
several times during a day. There are K < N slots to be allocated among the bidders for this keyword. A
bidder i has a true valuation $v_i$ for this keyword and she bids $b_i$. The expected click-through rate of an ad
put by bidder i when allocated slot j has the form $CTR_{ij} = \gamma_j e_i$, i.e., separable into a position effect and
an advertiser effect, wherein $e_i$ is the relevance of the bidder i. Further, it is assumed that $\gamma_j > \gamma_{j+1}$ for
all $j = 1, 2, \ldots, K$ and $\gamma_j = 0$ for all $j > K$. The search engine's estimate of the relevance $e_i$ of bidder i is
denoted by $q_i$ and bidder i's estimate of her relevance is denoted by $f_i$. There are no budget constraints.

Explore slots and tuning parameters: The auctioneer chooses two parameters $n < N$ and $L < K$. The auctioneer
designates the top L slots for exploratory purposes. Let us call these slots explore slots; slots L + 1
through K will be called non-explore. The auctioneer decides a set F of n bidders whose relevance she wants
to estimate. As described in the mechanism below, these n bidders will be the top n bidders according to
the auctioneer's ranking rule. If the auctioneer wants to just improve the estimate for some bidders, she chooses
$n \le K$, and if she also wants to estimate the relevance of some new bidder or some left-out bidder, she
chooses $n \ge K + 1$. The parameters n and L are publicly known. Further, as we shall see below, the
mechanism has n steps and during these n steps, the bidders in set F will be given an equal chance to appear
in the explore slots in the sense that they appear exactly once in each explore slot. During a step, when a
bidder does not appear in one of the explore slots, she competes for non-explore slots with all the bidders
who do not appear in the explore slots. Now we are ready to formally describe the new mechanism, which
we call Exp-GSP (Exploratory-Generalized Second Price).

The Exp-GSP Mechanism: 

• Bidders report their bids $b_1, b_2, \ldots, b_N$.

• Ranking Bidders: The auctioneer uses RBR to rank the bidders, i.e., she ranks the bidders in
decreasing order of $q_i b_i$. For clarity of notation, let us rename the bidders according to this ranking, i.e.,
bidder m is the one ranked m in this ranking.

• Allocating Explore Slots: There are n steps in the mechanism and the n bidders in F are ordered in
each step as follows. The ordering at step 1 is the above-mentioned RBR ranking, i.e.,
$[1, 2, \ldots, L \mid (L+1), \ldots, n]$. This order is cyclically shifted towards the left for $n - 1$ more steps. Thus the ordering
in step 2 is $[2, 3, \ldots, L, (L+1) \mid (L+2), \ldots, n, 1]$ and that in step 3 is
$[3, \ldots, (L+2) \mid (L+3), \ldots, n, 1, 2]$, and so on. In a particular step, for $j \le L$, the jth slot is assigned to the bidder having
rank j in this cyclically rotating ordering at that step. For example, in step 1, the slot $j \le L$ is allocated
to the bidder j. In step 2, the slot $j \le L$ is allocated to the bidder $j + 1$, and in step n, the first slot is
assigned to the bidder n, and for $2 \le j \le L$, the jth slot is allocated to the bidder $j - 1$. In a particular
step, a bidder will be called explore-active if she is assigned one of the explore slots in that step. Note
that this cyclic shifting rule ensures that during the total of n steps, each of the n bidders in F gets to
each explore slot exactly once; thus each one is explore-active for exactly L steps out of the n steps.
Also, in each step there are exactly L explore-active bidders.

• Allocating non-Explore Slots: Bidders from F who are not explore-active at a particular step, along
with bidders not in F, are allocated to non-explore slots as follows. Let $i_1 < i_2 < \cdots < i_{N-L}$
be the bidders who are not explore-active in this particular step. Recall that we renamed the bidders
according to the RBR ranking. Now the slot $L + j$ for $1 \le j \le K - L$ is assigned to the bidder $i_j$.
For example, in step 1, we have $i_j = L + j$; in step 2 we have $i_1 = 1$ and $i_j = L + j$ otherwise; and
in step n we have $i_j = L + j - 1$.

• Payments based on GSP: A bidder i is charged an amount equal to $\frac{q_{i+1} b_{i+1}}{q_i}$ per click.
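
The allocation steps of the mechanism above can be sketched as follows. This is an illustrative reading of the rules, not the authors' code: the function name `exp_gsp_schedule` is hypothetical, bidders are assumed to be already renamed by their RBR rank (1 is the top-ranked bidder), and bids and payments are left out.

```python
from typing import Dict, List


def exp_gsp_schedule(N: int, K: int, n: int, L: int) -> List[Dict[int, int]]:
    """For each of the n steps, map slot number -> bidder (both 1-indexed)."""
    steps = []
    for t in range(n):
        # Cyclic left shift of the top-n ranking by t positions.
        order = [(t + r) % n + 1 for r in range(n)]
        explore_active = order[:L]
        slots = {j + 1: explore_active[j] for j in range(L)}
        # Everyone else competes for the non-explore slots L+1..K in RBR order.
        active = set(explore_active)
        rest = [b for b in range(1, N + 1) if b not in active]
        for j, bidder in enumerate(rest[:max(K - L, 0)]):
            slots[L + 1 + j] = bidder
        steps.append(slots)
    return steps


# Example: 6 bidders, 4 slots, the top 5 bidders explored through 2 explore slots.
schedule = exp_gsp_schedule(N=6, K=4, n=5, L=2)
```

Each entry of `schedule` is one step of the mechanism; reading off slots 1 and 2 across the steps shows every bidder in F passing through each explore slot once.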
Nomenclatures: For the rest of the paper, we fix some nomenclature. The standard one-step mechanism
with RBR ranking and GSP pricing will be referred to as GSP, and the new exploration-based mechanism
described above (all the n steps together) will be referred to as Exp-GSP. Further, we will refer to the $\gamma_j$'s
as position-based click-through rates. Let $I_i$ denote all the information about the bidder i, i.e., $I_i$ includes
bidder i's true relevance $e_i$, the auctioneer's estimate of her relevance $q_i$, her estimate of her relevance $f_i$, her
true value $v_i$ and her estimate of her true value $\hat{v}_i$, all the knowledge of bidder i about the auction game, etc.
An instance of GSP is represented by $(N, K, (\gamma_j), (I_i))$ and that of Exp-GSP by $(N, K, n, L, (\gamma_j), (I_i))$.
Clearly, any given instance $(N, K, (\gamma_j), (I_i))$ of GSP is equivalent to an instance $(N, K, n, L, (\gamma_j), (I_i))$ of
Exp-GSP where $n = 1$, $L = 0$. Further, as we show in Section 3, a large class of instances of Exp-GSP
of interest to us can also be mapped to instances of GSP with properly defined position-based click-through
rates. This corroborates our claim that we deviate minimally from the mechanism currently in place.



3 Incentive properties 

In this section, we study the incentive properties of the n-step Exp-GSP mechanism, modeling it as a one-shot
static game of complete information, where the advertisers know others' bids and play the best response to
others' bids given their current estimates of their CTRs and their true valuations. This is reasonable as the
bidding process can be thought of as a continuous process, where bidders learn each other's bids [5, 17, 8, 9].
As we explain in the following, a large class of the instances of Exp-GSP can be mapped to instances of
GSP with properly defined click-through rates, which allows us to use the results on GSP. This
corroborates our claim that we deviate minimally from the mechanism currently in place. The solution
concept we will use is symmetric Nash equilibria (SNE)/locally envy-free equilibria studied in [5, 17]. First,
we define the effective CTR, which will help us map instances of Exp-GSP to those of GSP.

Definition 1 Effective Click-Through Rates: Let $l_1, l_2, \ldots, l_n$ be the slot positions that the bidder ranked j is
assigned in the steps $1, 2, \ldots, n$ of Exp-GSP respectively; then the effective CTR of a bidder i for slot
$j \le N$, denoted $\widetilde{CTR}_{ij}$, is defined as $\sum_{m=1}^{n} CTR_{i, l_m}$. Thus for the separable case, the effective position-based
CTR for slot $j \le N$, denoted $\theta_j$, is $\sum_{m=1}^{n} \gamma_{l_m}$.

Intuitively, the effective CTR of a bidder i for slot j is the sum of the expected CTRs of bidder i over
the n steps of Exp-GSP if she were ranked j. It is not hard to derive the following lemma.

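Lemma 2 Let $\gamma = \sum_{j=1}^{L} \gamma_j$. Then

$$\theta_m = \begin{cases} \gamma + \delta_m & \text{if } m \le n \\ n \gamma_m & \text{if } m > n \end{cases} \tag{1}$$

where

$$\delta_m = \begin{cases} (n - L - (m - 1))\, \gamma_{L+m} + \gamma_{L+1} + \gamma_{L+2} + \cdots + \gamma_{L+m-1} & \text{if } m \le L \\ (m - L)\, \gamma_m + \gamma_{m+1} + \cdots + \gamma_{m+L-1} + (n - m - L + 1)\, \gamma_{m+L} & \text{if } L < m \le n - L \\ (m - L)\, \gamma_m + \gamma_{m+1} + \cdots + \gamma_n & \text{if } m > n - L \end{cases} \tag{2}$$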
In the above lemma, $\gamma$ represents the effective position-based click-through rate that a bidder obtains
from the explore slots (in n steps) and $\delta_m$ represents the effective position-based click-through rate that the bidder
m obtains from the non-explore slots (in n steps). In particular, $\delta_m$ indicates how many steps the bidder
m spends in specific non-explore slots. For example, $\delta_1 = (n - L)\gamma_{L+1}$ indicates that the bidder 1 spends
$(n - L)$ steps in the slot numbered $(L + 1)$, $\delta_2 = (n - L - 1)\gamma_{L+2} + \gamma_{L+1}$ indicates that the bidder 2
spends $(n - L - 1)$ steps in the slot $(L + 2)$ and one step in the slot $(L + 1)$, and so on for other bidders. In
the following lemma we observe that these effective position-based CTRs are in fact strictly monotonically
decreasing like the $\gamma_j$'s. The proof is provided in the Appendix.
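
The effective position-based CTRs can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original analysis: it computes them once by stepping through the cyclic schedule and once from the closed form of Lemma 2 as we read it (explore part plus non-explore part, in three cases), so the two can be compared. All function names are hypothetical, and position CTRs beyond the last slot are taken to be zero.

```python
from typing import List


def gamma_at(gammas: List[float], slot: int) -> float:
    """Position-based CTR of a 1-indexed slot; zero for slots beyond the last one."""
    return gammas[slot - 1] if 1 <= slot <= len(gammas) else 0.0


def theta_by_simulation(gammas: List[float], N: int, n: int, L: int) -> List[float]:
    """Add up, over the n steps, the position CTR of the slot each bidder occupies."""
    theta = [0.0] * (N + 1)  # index 0 unused; bidders are named by RBR rank
    for t in range(n):
        order = [(t + r) % n + 1 for r in range(n)]  # cyclic left shift by t
        explore = order[:L]
        active = set(explore)
        for slot, bidder in enumerate(explore, start=1):
            theta[bidder] += gamma_at(gammas, slot)
        rest = [b for b in range(1, N + 1) if b not in active]
        for j, bidder in enumerate(rest, start=1):
            theta[bidder] += gamma_at(gammas, L + j)
    return theta[1:]


def theta_closed_form(gammas: List[float], N: int, n: int, L: int) -> List[float]:
    """Closed form: explore part plus a non-explore part with three cases."""
    explore_part = sum(gamma_at(gammas, j) for j in range(1, L + 1))
    out = []
    for m in range(1, N + 1):
        if m > n:
            out.append(n * gamma_at(gammas, m))  # bidder outside F keeps slot m
            continue
        if m <= L:
            delta = (n - L - (m - 1)) * gamma_at(gammas, L + m)
            delta += sum(gamma_at(gammas, L + k) for k in range(1, m))
        elif m <= n - L:
            delta = (m - L) * gamma_at(gammas, m)
            delta += sum(gamma_at(gammas, m + k) for k in range(1, L))
            delta += (n - m - L + 1) * gamma_at(gammas, m + L)
        else:
            delta = (m - L) * gamma_at(gammas, m)
            delta += sum(gamma_at(gammas, k) for k in range(m + 1, n + 1))
        out.append(explore_part + delta)
    return out


example_gammas = [0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
simulated = theta_by_simulation(example_gammas, N=8, n=5, L=2)
closed = theta_closed_form(example_gammas, N=8, n=5, L=2)
```

Comparing `simulated` with `closed` for several choices of n and L is a quick sanity check on the case boundaries of the lemma.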

Lemma 3 Let $\tilde{K} = \max\{K, n\}$, $n \le \min\{K + 1, K + L\}$, and $L \le \frac{1}{2}(n - 1)$. Then

$$\theta_1 > \theta_2 > \cdots > \theta_{\tilde{K}} > 0$$

and $\theta_i = 0$ for all $i > \tilde{K}$.

Now under Exp-GSP the payoff of the bidder m is

$$u_m = \theta_m e_m \left( v_m - \frac{q_{m+1} b_{m+1}}{q_m} \right), \tag{3}$$

which has exactly the same functional form as in GSP, where the $\theta_m$'s take the place of the $\gamma_m$'s, and therefore
our name for the $\theta_m$'s makes sense. Thus an instance $(N, K, n, L, (\gamma_j), (I_i))$ of Exp-GSP where $n \le K + 1$
and $L \le \frac{1}{2}(n - 1)$ can be mapped to an instance $(N, \max\{K, n\}, (\theta_j), (I_i))$ of GSP. We formalize this in
the following theorem.

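Theorem 4 For each instance $(N, K, n, L, (\gamma_j), (I_i))$ of Exp-GSP with $n \le K + 1$ and $L \le \frac{1}{2}(n - 1)$, there
is an instance $(\tilde{N}, \tilde{K}, (\tilde{\gamma}_j), (\tilde{I}_i))$ of GSP such that the game induced by $(N, K, n, L, (\gamma_j), (I_i))$ is equivalent
to the game induced by $(\tilde{N}, \tilde{K}, (\tilde{\gamma}_j), (\tilde{I}_i))$. In particular, $\tilde{N} = N$, $\tilde{K} = \max\{n, K\}$, $\tilde{\gamma}_j = \theta_j$, $\tilde{I}_i = I_i$, where
the $\theta_j$'s are defined by Equations (1) and (2).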

It is interesting to note that even though we allowed lower ranked bidders to obtain top slots, the competition
for the non-explore slots keeps the effective position-based CTRs still in the same relative order.
The highest ranked bidder still gets the best service compared to others although her effective payoff might
have decreased. A lower ranked bidder still gets relatively lower quality of service than the bidders above
her although her payoff might have improved. This same structural form of payoffs allows us to derive Theorem 4
and therefore to utilize the results on GSP studied in [5, 17, 8, 9, 2, 1] and in particular the following
theorem on existence of pure Nash equilibria for Exp-GSP. Thus our approach can also be understood as a
reduction among mechanism design problems.

Theorem 5 There always exists a pure Nash equilibrium bid profile for the Exp-GSP.

As noted in the above theorem, there always exist pure strategy Nash equilibria for the Exp-GSP auction
game. However, this existential proof does not give much insight into what equilibria might arise in
practice. Edelman et al. [5] proposed a class of Nash equilibria which they call locally envy-free equilibria
and argue that such an equilibrium arises if agents are raising their bids to increase the payments of those
above them, a practice which is believed to be common in actual keyword auctions. Varian [17] independently
proposed this solution concept, which he calls symmetric Nash equilibria (SNE), and provided some
empirical evidence that the Google bid data agrees well with the SNE bid profile. In a similar way we can
obtain the following observation.
Theorem 6 An SNE bid profile $b_i$'s for Exp-GSP satisfies

$$(\theta_i - \theta_{i+1})\, v_{i+1} q_{i+1} + \theta_{i+1} q_{i+2} b_{i+2} \le \theta_i q_{i+1} b_{i+1} \le (\theta_i - \theta_{i+1})\, v_i q_i + \theta_{i+1} q_{i+2} b_{i+2} \tag{4}$$

for all $i = 1, 2, \ldots, N$.

Note that Theorem 6 assumes that the bidders know their true valuations $v_i$'s; however, the theorem
holds even if this is not the case, by replacing $v_i$ by bidder i's current estimate of her true valuation.

Now, recall that in the Exp-GSP, the bidder i pays an amount $\frac{q_{i+1} b_{i+1}}{q_i}$ per click; therefore the expected
payment i makes under Exp-GSP (in n steps) is $\theta_i q_i \frac{q_{i+1} b_{i+1}}{q_i} = \theta_i q_{i+1} b_{i+1}$. Thus the best SNE bid profile
for advertisers (worst for the auctioneer) is the minimum bid profile possible according to Theorem 6 and is
given by

$$\theta_i q_{i+1} b_{i+1} = \sum_{j=i}^{K} (\theta_j - \theta_{j+1})\, v_{j+1} q_{j+1}. \tag{5}$$

For the revenue comparison in the next section, we fix this minimum SNE bid profile as the solution
concept. Essentially the same results hold for the maximum SNE bid profile as well.
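
The minimum SNE bid profile can be computed with a simple backward recursion over slots. The sketch below is a hedged illustration of that recursion as we read it from the text (the expected payment for a slot equals the sum, over that slot and all lower ones, of the drop in effective CTR times the next bidder's estimated relevance and value); the function names are hypothetical, the effective CTR below the last slot is assumed to be zero, and the inputs are assumed to be listed in rank order with one more bidder than slots.

```python
from typing import List


def min_sne_payments(theta: List[float], q: List[float], v: List[float]) -> List[float]:
    """Expected payment for each slot at the minimum SNE.

    theta[i] is the effective CTR of slot i+1; q and v list estimated relevance
    and value of the bidders in rank order and must have at least len(theta)+1 entries.
    """
    K = len(theta)
    ext = list(theta) + [0.0]  # assumption: effective CTR is zero below the last slot
    pay = [0.0] * (K + 1)
    for i in range(K - 1, -1, -1):
        # Payment of slot i = CTR drop times next bidder's q*v, plus the payment one slot down.
        pay[i] = (ext[i] - ext[i + 1]) * v[i + 1] * q[i + 1] + pay[i + 1]
    return pay[:K]


def min_sne_bids(theta: List[float], q: List[float], v: List[float]) -> List[float]:
    """Bids of the bidders ranked 2..K+1 implied by the minimum SNE payments."""
    pay = min_sne_payments(theta, q, v)
    return [pay[i] / (theta[i] * q[i + 1]) for i in range(len(theta))]


example_bids = min_sne_bids([2.0, 1.0], [1.0, 1.0, 1.0], [10.0, 6.0, 2.0])
```

The recursion does not pin down the top-ranked bidder's bid, since her bid affects nobody's payment as long as she stays ranked first.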



4 Revenue comparison and the cost of uncertainty 

In this section we study the revenue properties of Exp-GSP and compare it to that of GSP. We first define 
the cost of uncertainty to formalize the loss of revenue due to exploration. 

Definition 7 Cost of uncertainty: Let $R_0$ be the expected revenue of the auctioneer for GSP at its minimum
SNE and $R$ be her expected revenue for Exp-GSP (in n steps) at the corresponding minimum SNE; then the "cost
of uncertainty" associated with the exploration is defined as $\frac{R_0 - R/n}{R_0}$, i.e., the expected relative loss in the
revenue per impression, and is denoted as $\rho$.



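Using Equation (5), we have

$$R = \sum_{s=1}^{K} \sum_{j=s}^{K} (\theta_j - \theta_{j+1})\, q_{j+1} v_{j+1}$$

and

$$R_0 - \frac{R}{n} = \sum_{s=1}^{K} \sum_{j=s}^{K} (\gamma_j - \gamma_{j+1})\, q_{j+1} v_{j+1} - \frac{1}{n} \sum_{s=1}^{K} \sum_{j=s}^{K} (\theta_j - \theta_{j+1})\, q_{j+1} v_{j+1} = \sum_{s=1}^{K} \sum_{j=s}^{K} \left[ (\gamma_j - \gamma_{j+1}) - \frac{1}{n} (\theta_j - \theta_{j+1}) \right] q_{j+1} v_{j+1}.$$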



By utilizing the relationship among the $\gamma_j$'s and $\theta_j$'s we can obtain the following theorem, which provides a
nice upper bound on the cost of uncertainty. The proof of this theorem is provided in the Appendix.

Theorem 8 Let $R_0^{l}$ be the revenue of the auctioneer from the top $l$ bidders and $R_0$ be her total revenue in GSP, and
let

c= mm — . (6) 

l<j<n-L Jj - 7j+i 

then 



p{L, n) <ll- min{l, c}(l ) 



2L.1 / R] 



min{n,X} " 
■0 



n I \ Ri 



■0 



< <'l-min{l,c}(l-— )1. (7) 



n 



First, note that the above bound is 0 when $L = 0$, indicating no revenue loss when there is no exploration.
Further, given an n, as L increases the bound deteriorates, confirming our intuition that a higher cost
is incurred for better exploration. Also, for a given L, we can note that the factor $R_0^{\min\{n,K\}}/R_0$ is dominant and
increases as n increases, and therefore the bound deteriorates as n increases. We see that the auctioneer can tune
the parameters L and n so as to improve revenue: the smaller L and n are, the better off the auctioneer is. But as the
auctioneer also wants to get some valuable information so as to estimate parameters such as the relevance of
the advertisers, and also wants to give flexibility to lower ranked bidders to figure out their valuations, she
would like to keep L and n large. Therefore, the auctioneer can choose a suitable L and n to balance
between these two conflicting needs. Furthermore, it is clear that a finer analysis will reveal a much better
revenue guarantee, i.e., an even smaller $\rho$. For example, usually the expression on the right hand side of Equation
(6) in the above theorem is dominated by $j = 1$; however, if we look at the expression for revenue, the $j = 1$
term appears only once, unlike all other j's, and neglecting $j = 1$ does not noticeably change the difference
in the revenues; therefore a better c might be achievable with this fine tuning.
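
To get a feel for how the choice of L and n affects revenue, the cost of uncertainty can be evaluated directly for a small instance rather than through the bound. The sketch below is an illustration under explicit assumptions, not the paper's procedure: revenue at the minimum SNE is taken to be the sum of the per-slot minimum SNE payments, the Exp-GSP revenue over n steps is divided by n to make it per impression, and the function names are hypothetical.

```python
from typing import List


def min_sne_revenue(ctr: List[float], q: List[float], v: List[float]) -> float:
    """Sum over slots of the minimum-SNE payment, for position CTRs `ctr`.

    q and v list estimated relevance and value in rank order and must have
    at least len(ctr)+1 entries; the CTR below the last slot is assumed zero.
    """
    K = len(ctr)
    ext = list(ctr) + [0.0]
    total, tail = 0.0, 0.0
    for s in range(K - 1, -1, -1):
        tail += (ext[s] - ext[s + 1]) * q[s + 1] * v[s + 1]  # payment for slot s
        total += tail
    return total


def cost_of_uncertainty(gammas: List[float], theta: List[float], n: int,
                        q: List[float], v: List[float]) -> float:
    """Relative loss in revenue per impression (assumed convention: R is over n steps)."""
    r0 = min_sne_revenue(gammas, q, v)
    r = min_sne_revenue(theta, q, v)
    return (r0 - r / n) / r0


# Example with n = 3, L = 1: effective CTRs are
# gamma_1 + 2*gamma_2, gamma_1 + gamma_2 + gamma_3, gamma_1 + 2*gamma_3.
example_gammas = [0.5, 0.3, 0.1]
example_theta = [1.1, 0.9, 0.7]
rho = cost_of_uncertainty(example_gammas, example_theta, 3,
                          [1.0, 1.0, 1.0, 1.0], [10.0, 8.0, 6.0, 4.0])
```

With no exploration (L = 0) the effective CTRs are just n times the position CTRs, and the same function returns a cost of zero up to rounding.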



We can also note that Theorem 8 still holds true when we replace the RBR ranking rule in GSP and
Exp-GSP by any weighted ranking rule (i.e., in decreasing order of $w_i b_i$'s) and change the payment
rules accordingly (i.e., $\frac{w_{i+1} b_{i+1}}{w_i}$ per click to the ith ranked bidder).



5 Efficiency comparison 

Revenue is a natural yardstick for comparing different auction forms from the viewpoint of the seller (the
auctioneer); however, from a social point of view, another natural and potentially important yardstick
is efficiency, that is, the social value of the object. The object should end up in the hands of the people who
value it the most. The efficiency in the adword auction model is therefore the total valuation, and turns out
to be the combined profit of the auctioneer and all the bidders. Let us denote the efficiency for the Exp-GSP
as $E$ and that for GSP as $E_0$; then

$$E = \sum_{m=1}^{K} \theta_m e_m v_m, \tag{8}$$

$$E_0 = \sum_{m=1}^{K} \gamma_m e_m v_m. \tag{9}$$

Using Lemma 2 and rearranging the terms in $E$, we get the following.

Lemma 9

$$E = \sum_{m=1}^{K} \gamma_m \beta_m \tag{10}$$

where

$$\beta_m = \sum_{i=1}^{n} e_i v_i \quad \text{if } m \le L, \tag{11}$$

$$\beta_m = \begin{cases} (n - m + 1)\, e_{m-L} v_{m-L} + \sum_{i=m-L+1}^{m-1} e_i v_i + (m - L)\, e_m v_m & \text{if } L < m \le n \\ n\, e_m v_m & \text{if } m > n \end{cases} \tag{12}$$

The above lemma allows us to bound the loss in efficiency due to exploration, as we note in the following
theorem, whose proof is deferred to the Appendix.

Theorem 10 Let Eq = J2i=i Im^mVm, E^'^ = J27=L+i Im^mVm then the relative loss in efficiency per 
impression is 

^^{"-«(f)-'(f)} 

where 

^^1 — T^U^ — f (14) 

n ma.Xi<m<L emVm L<m<n [m-L<i<m \ emVm J ) 

First, note that the above bound is 0 when $L = 0$, indicating no efficiency loss when there is no exploration. 
Further, for a given $n$, the bound deteriorates as $L$ increases, and similarly, for a given $L$, the bound 
deteriorates as $n$ increases. Apart from the tuning parameters $n$ and $L$, note that there is another interesting 
parameter $\eta$, which depends on the true relevances and the true values of the advertisers. In particular, 
it indicates how far the current estimates are from the true ones. For example, in the extreme case when 
the auctioneer knows the true relevances, the ordering by $q_m v_m$ will be equivalent to the ordering by 
$e_m v_m$ and $\eta$ will in fact be 0, improving the bound. Thus the closer we are to the optimal efficiency, the less we 
lose in efficiency due to exploration. The proof of Theorem 10 includes the following observation for the 
case when the ordering by $q_m v_m$ is the same as the ordering by $e_m v_m$. 

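Corollary 11 Under the assumption that $e_m v_m \ge e_{m+1} v_{m+1}$ for all $1 \le m < n$, the upper bound in 
Theorem 10 can be improved to 

$$\frac{E_0 - \frac{1}{n}E}{E_0} \le (1-\alpha)\,\frac{E_0^{L}}{E_0} - \frac{\omega}{n}\,\frac{E_0^{>L}}{E_0}$$

where $\alpha = \frac{1}{n}\,\frac{\sum_{i=1}^{n} e_i v_i}{e_1 v_1}$ and $\omega = \min_{L < m \le n}\left(\frac{e_{m-L} v_{m-L}}{e_m v_m} - 1\right)$. 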

Now let us consider the effect of exploration on the user experience. Following [10], the user experience 
can be defined as the total clickability of all the ads, i.e. how likely a user is to click on the ads 
altogether. Therefore, for GSP it is $\sum_{m=1}^{K} \gamma_m e_m$ and for Exp-GSP it is $\sum_{m=1}^{K} \theta_m e_m$. Clearly, observations 
on the loss of user experience due to exploration can be obtained in the same way as for efficiency. 

6 Estimating the relevance and valuations 



Let $M_i$ be the number of clicks that advertiser $i$ receives in Exp-GSP; then her relevance $e_i$ is estimated 
as $\frac{M_i}{\theta_i}$, and the deviation will not be high, as can be argued using Chernoff bound arguments. Formally, let 
$M_{ij}$ be a 0-1 random variable indicating whether advertiser $i$ gets a click in the $j$th impression (i.e. 
the $j$th step in Exp-GSP) or not, and let $M_i = \sum_{j=1}^{n} M_{ij}$. Clearly, $E[M_i] = \sum_{j=1}^{n} E[M_{ij}] = \theta_i e_i$. Then by 
the Chernoff bound, for any $0 < \delta < 1$, we have 

$$\Pr\left(\left|e_i - \frac{M_i}{\theta_i}\right| > \delta e_i\right) < 2e^{-\theta_i e_i \delta^2 / 4}. \tag{15}$$

A simple calculation implies that we can get an estimate of $e_i$ within a $\delta$ fraction with probability $1 - \epsilon$ 
as long as we have 

$$\theta_i \ge \frac{4}{e_i \delta^2}\ln\left(\frac{2}{\epsilon}\right). \tag{16}$$
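
To make the trade-off between accuracy and exposure concrete, the sketch below evaluates the Chernoff-style failure bound and the effective CTR needed for a target accuracy. It assumes the bound has the form $2e^{-\theta_i e_i \delta^2/4}$, so that the requirement becomes $\theta_i \ge \frac{4}{e_i\delta^2}\ln\frac{2}{\epsilon}$; the constant 4, the function names, and the sample values are assumptions made for illustration.

```python
import math


def chernoff_failure_bound(theta_i, e_i, delta):
    """Assumed upper bound on Pr(|e_i - M_i/theta_i| > delta * e_i)."""
    return 2.0 * math.exp(-theta_i * e_i * delta ** 2 / 4.0)


def required_effective_ctr(e_i, delta, eps):
    """Smallest theta_i for which the assumed bound drops to eps."""
    return 4.0 * math.log(2.0 / eps) / (e_i * delta ** 2)


# Hypothetical advertiser: relevance 0.1, 50% relative error, 95% confidence.
theta_needed = required_effective_ctr(e_i=0.1, delta=0.5, eps=0.05)
failure_at_threshold = chernoff_failure_bound(theta_needed, e_i=0.1, delta=0.5)
print(theta_needed, failure_at_threshold)
```

Halving $\delta$ quadruples the required $\theta_i$, which is one way to see why pooling clicks over several phases of Exp-GSP helps.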

Normally we will be interested in estimating the relevance of lower ranked advertisers, and clearly for 
them the value of $\theta_i$ increases as we increase the value of $L$, so we can guarantee a better estimation. In 
particular, given a value of $L$ and $n$, we can have reliable estimation with probability $1 - \epsilon$ within a fraction 



of \J and an additive estimation within y j-ln{^). The above estimation can be improved even 
further by sampling from many phases of Exp-GSP. Note that even if we consider the $l$ phases of Exp-GSP 
as a single shot game, the results of Sections 3 and 4 remain unchanged, and in particular the 
cost of uncertainty does not change. As above, using Chernoff bound arguments, we can obtain an additive 
estimation within $\delta$ with probability $1 - \epsilon$ if we use $l$ phases where 

' £ il-Mi)^ (17) 

Thus we can obtain an estimation negligibly (i.e. inverse polynomially in the parameters $n$ and $L$) close to the true 
value with probability exponentially close to 1 in polynomially many phases of Exp-GSP. We summarize 
the above observation in the following theorem. 

Theorem 12 The relevance of advertiser $i$ can be estimated within $\delta$ with probability $1 - \epsilon$ by using $l$ 
phases of Exp-GSP where, 



Even a single phase of Exp-GSP can provide pretty good estimate with probabilty 1 — e within ^ j-ln{j) 
of her true relevance. 

In a similar way, the advertisers can estimate their valuations. A reasonable way for an advertiser to 
estimate her value is by tracking conversions, i.e. which clicks lead to a purchase or an activity of the 
advertiser's interest. Let $x_i$ be the value advertiser $i$ derives from a single conversion, $a_i$ be the conversion 
probability per click, and $Q_i$ be the total number of conversions she obtains in Exp-GSP; then she can 
estimate her value to be $\frac{Q_i}{\theta_i f_i} x_i$ per click, and using the Chernoff bound as above together with a union bound we can 
argue that this estimation is very good. Here $f_i$ is her updated estimate of her relevance using the current 
phase of Exp-GSP. In reality, it might be difficult to track conversions, but it is not clear how the 
advertiser can estimate her value without knowledge of her conversion rate. Further, it is also possible that she derives 
some value from impressions and clicks even when they do not lead to a conversion. For example, an 
impression gives some branding value and a click improves her relevance score even when they do not lead 
to a conversion. In this general case, let x[ , , xf be the values advertiser i derives from an impression, a 

click and a conversion respectively then she can estimate her value to be "'""^'^i ^^^^^ per click. 
7 Variations of Exp-GSP: 

Truthful Implementation and non-separable Click-through rates 



Recall from Section 3 that the effective CTR of a bidder $i$ for slot $j$, denoted $C_{i,j}$, is the sum of the expected 
CTRs of bidder $i$ over the $n$ steps in Exp-GSP if she were ranked $j$. In a similar way as 
for the $\theta_j$'s, we can derive the following lemmas. 

Lemma 13 Let $\beta_i = \sum_{j=1}^{L} c_{i,j}$; then 

$$C_{i,m} = \begin{cases} \beta_i + d_{i,m} & \text{if } m \le n \\ n\, c_{i,m} & \text{if } m > n \end{cases} \tag{18}$$

where 

$$d_{i,m} = \begin{cases} (n-L-(m-1))\, c_{i,L+m} + c_{i,L+1} + c_{i,L+2} + \cdots + c_{i,L+m-1} & \text{if } m \le L \\ (m-L)\, c_{i,m} + c_{i,m+1} + \cdots + c_{i,m+L-1} + (n-m-L+1)\, c_{i,m+L} & \text{if } L < m \le n-L \\ (m-L)\, c_{i,m} + c_{i,m+1} + \cdots + c_{i,n} & \text{if } m > n-L \end{cases} \tag{19}$$

Lemma 14 Let $\bar{K} = \max\{K, n\}$, $n \le \min\{K+1, K+L\}$, and $L \le \frac{1}{2}(n-1)$; then for all $1 \le i \le N$, 

$$C_{i,1} > C_{i,2} > \cdots > C_{i,\bar{K}} > 0$$

and $C_{i,j} = 0$ for all $j > \bar{K}$. 

Consider any ranking based mechanism and the corresponding exploration based generalization as described 
in Section 2, with the payment rule modified accordingly; then the instances of the two mechanisms are 
given by $(N, K, (c_{i,j}), (f_i))$ and $(N, K, n, L, (c_{i,j}), (f_i))$ respectively. Therefore, using Lemmas 13 and 14, 
we can obtain a reduction similar to Theorem 1: for each instance $(N, K, n, L, (c_{i,j}), (f_i))$ of the exploration 
based mechanism with $n \le K+1$ and $L \le \frac{1}{2}(n-1)$, there is an instance $(N, \max\{n, K\}, (C_{i,j}), (f_i))$ of the 
corresponding one step mechanism without exploration such that the game induced by $(N, K, n, L, (c_{i,j}), (f_i))$ 
is equivalent to the game induced by $(N, \max\{n, K\}, (C_{i,j}), (f_i))$, where $C_{i,j}$ is given by Equations 18 and 
19. Therefore, we can use all the results from the one step mechanism without exploration. In the following we 
consider two variations of Exp-GSP - (i) for the given ranking mechanism the goal is to design a truthful 
mechanism and even allowing non-separable CTRs and we do so by introducing a new payment rule and 
utilizing results from HI via the above reduction, and (ii) where we restrict ourselves to the same ranking 
and payment rules but allow CTRs to be non-separable utilizing results from m via the above reduction. 



It is known that the GSP is not truthfulU] [51 [H and clearly this holds true for Exp-GSP as well. And as 
we mentioned in the Section [T] there is a result IH with a goal towards implementing a truthful mechanism 
while learning the CTRs, and to achieve this goal it had to deviate from the current pricing scheme. Our 
exploration based mechanism described in Section |2] can also be made truthful by changing the payment 
rule. All the description of the mechanism remains the same except the following: 

• The bidders are ranked by $q_i b_i$, where $q_i$ is the quality score the search engine defines for bidder 
$i$. For example, usual choices of $q_i$ are the search engine's estimate of $c_{i,1}$ or that of $\sum_{j=1}^{K} c_{i,j}$. 
• The bidder $i$ is charged an amount per-click $p_i$ given by, 



In the spirit of [1], we call this variation of our exploration mechanism Exp-Laddered; it can be 
proved to be truthful by adapting the proof in [1]. We refer to the usual one step truthful mechanism without 
any exploration as Laddered. Now let us compute the cost of uncertainty in this truthful implementation; 
as we will see below, we can obtain a similar upper bound as in Section 4. Let $R_0$ be the expected revenue 
of the auctioneer for Laddered and $R$ be her expected revenue for Exp-Laddered; then 

$$R_0 = \sum_{i=1}^{K} \sum_{j=i}^{K} (c_{i,j} - c_{i,j+1})\, \frac{q_{j+1} v_{j+1}}{q_i} \tag{21}$$

$$R = \sum_{i=1}^{K} \sum_{j=i}^{K} (C_{i,j} - C_{i,j+1})\, \frac{q_{j+1} v_{j+1}}{q_i} \tag{22}$$

Performing calculations as in Section 4, we can obtain the following theorem. 
Theorem 15 Let 

$$c = \min_{1 \le i \le \min\{n, K\}}\ \min_{i \le j \le n-L} \frac{c_{i,j+L} - c_{i,j+1+L}}{c_{i,j} - c_{i,j+1}} \tag{23}$$

then the "cost of uncertainty" associated with the truthful implementation is upper bounded by 

$$1 - \min\{1, c\}\left(1 - \frac{2L}{n}\right). \tag{24}$$
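
For intuition about how this bound behaves, the following sketch evaluates it in the separable case, where the bidder-specific factor cancels and $c$ reduces to the smallest ratio $(\gamma_{j+L} - \gamma_{j+1+L})/(\gamma_j - \gamma_{j+1})$ over $1 \le j \le n-L$, as in the appendix proof of Theorem 8. It assumes the bound has the form $1 - \min\{1, c\}(1 - 2L/n)$; the function name and the geometric CTR profile are hypothetical choices for illustration.

```python
def cost_of_uncertainty_bound(gamma, n, L):
    """Return (c, bound) for separable CTRs gamma_1 > gamma_2 > ... > gamma_K."""
    K = len(gamma)
    assert n <= K and 2 * L <= n
    g = [0.0] + list(gamma) + [0.0]            # 1-indexed, gamma_{K+1} = 0
    ratios = [
        (g[j + L] - g[j + 1 + L]) / (g[j] - g[j + 1])
        for j in range(1, n - L + 1)
    ]
    c = min(ratios)
    bound = 1.0 - min(1.0, c) * (1.0 - 2.0 * L / n)
    return c, bound


# Hypothetical geometric position CTRs gamma_j = 0.5 ** j over K = 8 slots.
gamma = [0.5 ** j for j in range(1, 9)]
c, bound = cost_of_uncertainty_bound(gamma, n=6, L=2)
c_no_explore, bound_no_explore = cost_of_uncertainty_bound(gamma, n=6, L=0)
print(c, bound, bound_no_explore)
```

With $L = 0$ every ratio equals 1 and the bound collapses to 0, matching the fact that no exploration costs nothing. For steeply decaying CTRs such as this geometric profile, $c$ is small and the bound becomes loose.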

Note that Theorem 15 is consistent with Theorem 8 when we assume CTRs to be separable, i.e. $c_{i,j} = \gamma_j e_i$. 



Now we consider the variation of Exp-GSP where we restrict ourselves to the same ranking and payment 
rules but allow CTRs to be non-separable. If there were no restrictions on the ranking rule, following 
[15], [4], [3] we could argue that there would always exist Walrasian equilibria, and in particular an equilibrium 
where every bidder pays her opportunity cost. This equilibrium is called the MP pricing equilibrium, 
as at this equilibrium every bidder obtains her marginal product as her payoff. But there exist ranking 
rules for which there is no MP pricing equilibrium [1]. As Laddered is the unique truthful mechanism given a 
weighted ranking rule, whenever an MP pricing equilibrium exists that is compatible with the ranking rule 
in Exp-GSP, every bidder's payment is the same as in Exp-Laddered, and therefore the expected revenues 
of the auctioneer at the minimum SNE of GSP and Exp-GSP are the same as for Laddered and Exp-Laddered 
respectively. Thus the cost of uncertainty is the same as in the case of the truthful implementation and is given 
by Theorem 15. The existence of Walrasian equilibria (not necessarily the MP pricing) can be explicitly 
proven for the ranking used in Exp-GSP utilizing the results from [2], but unfortunately it does not have a 
nice analytical form, unlike in the separable CTRs case or in the truthful case, and analytical computation of the 
cost of uncertainty does not seem feasible. However, intuition from the earlier sections indicates that 
results similar to those in Section 4 should hold. 

It is clear that the estimation results from Section 6 can easily be extended to both the variations of 
Exp-GSP discussed above, and we omit the detailed discussion. 

8 Concluding remarks 



We proposed a deterministic exploration mechanism to learn the relevance scores by deviating minimally 
from the mechanism currently used by Google and Yahoo!, in the sense that it retains the same pricing 
rule as well as a similar ranking scheme. We show that such an exploration policy can be incorporated 
without any significant loss in revenue for the auctioneer. An independent work reported in [18] introduces 
a randomized exploratory mechanism and analyzes its incentive properties. We demonstrate that the 
mathematical structure of the payoffs in our proposed exploratory mechanism (Exp-GSP) is identical to that in 
the standard mechanism (i.e., without exploration), allowing us to compare and contrast the various metrics 
at the corresponding SNEs. We show that while the actual bid profiles of Exp-GSP and GSP may differ at 
the corresponding SNEs, the macroscopic measures, such as revenue and efficiency, do not differ 
significantly, allowing auctioneers to limit the cost of uncertainty. The approach in [18], on the other hand, centers 
around showing that both mechanisms (i.e., the standard GSP and the proposed exploratory randomized 
mechanism) would share almost-identical equilibrium bid profiles; of course, the auctioneer still pays a 
price for learning the quality factors (as in our case). These two different approaches to the design of 
exploratory mechanisms raise an important topic for future work: what other exploratory mechanisms can one 
design, and are there lower bounds on the cost or price of uncertainty? That is, can one design mechanisms 
that have the optimal characteristics when it comes to revenue loss vs. the information gathered about 
quality factors and valuations? Clearly, more work is necessary and more mechanisms such as those proposed 
herein and in [18] need to be studied. 

Appendix 

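Proof of Lemma 3 Let $m < L$; then 

$$d_m = (n-L-(m-1))\,\gamma_{L+m} + \gamma_{L+1} + \gamma_{L+2} + \cdots + \gamma_{L+m-1}$$

$$d_{m+1} = (n-L-m)\,\gamma_{L+m+1} + \gamma_{L+1} + \gamma_{L+2} + \cdots + \gamma_{L+m}$$

$$\therefore\ d_m - d_{m+1} = (n-L-m)(\gamma_{m+L} - \gamma_{m+1+L}).$$

As we have $\gamma_j > \gamma_{j+1}$ for all $1 \le j \le K$, we get 

$$d_m - d_{m+1} > 0$$

whenever $m < n - L$ and $m \le K - L$, and therefore we have 

$$d_1 > d_2 > \cdots > d_{L-1} > d_L$$

whenever $L \le \frac{1}{2}\min\{n, K+1\}$. 
For $L \le m \le n - L$, 

$$d_m = (m-L)\,\gamma_m + \gamma_{m+1} + \cdots + \gamma_{m+L-1} + (n-m-L+1)\,\gamma_{m+L}$$

$$d_{m+1} = (m+1-L)\,\gamma_{m+1} + \gamma_{m+2} + \cdots + \gamma_{m+L} + (n-m-L)\,\gamma_{m+L+1}$$

$$\therefore\ d_m - d_{m+1} = (m-L)(\gamma_m - \gamma_{m+1}) + (n-m-L)(\gamma_{m+L} - \gamma_{m+1+L})$$

$$d_L - d_{L+1} = (n-2L)(\gamma_{2L} - \gamma_{2L+1}) > 0 \text{ whenever } n > 2L \text{ and } 2L \le K.$$

For $L < m \le n - L$, clearly $(n-m-L)(\gamma_{m+L} - \gamma_{m+1+L}) \ge 0$, and $(m-L)(\gamma_m - \gamma_{m+1}) > 0$ whenever 
$m \le K$, and therefore $d_m > d_{m+1}$ whenever $n \le K + L + 1$. 

$$\therefore\ d_L > d_{L+1} > \cdots > d_{n-L}$$

whenever $L \le \frac{1}{2}\min\{n-1, K\}$ and $n \le K + L + 1$. 
Further, for $n - L \le m \le n - 1$, 

$$d_m = (m-L)\,\gamma_m + \gamma_{m+1} + \cdots + \gamma_n$$

$$d_{m+1} = (m+1-L)\,\gamma_{m+1} + \gamma_{m+2} + \cdots + \gamma_n$$

$$\therefore\ d_m - d_{m+1} = (m-L)(\gamma_m - \gamma_{m+1})$$

$$\therefore\ d_m > d_{m+1} \text{ whenever } m \le K$$

$$\therefore\ d_{n-L} > d_{n-L+1} > \cdots > d_n \text{ whenever } n \le K + 1.$$

Combining the above relations and noting that $\theta_j = \gamma + d_j$ for all $1 \le j \le n$, we obtain 

$$\theta_j > \theta_{j+1} \text{ for all } 1 \le j \le n-1$$

whenever $L \le \frac{1}{2}\min\{n-1, K\}$ and $n \le K + 1$. 

Now, $\theta_n - \theta_{n+1} = \gamma + (n-L)\gamma_n - n\gamma_{n+1} > 0$ whenever $L > 0$ or $n \le K$, and for $j > n$, $\theta_j - \theta_{j+1} = 
n(\gamma_j - \gamma_{j+1}) > 0$ whenever $j \le K$, and $\theta_j - \theta_{j+1}$ is 0 otherwise. This completes the proof. 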
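Proof of Theorem 8 

Now from the proof of Lemma 2 we can observe that 

$$\theta_j - \theta_{j+1} = \begin{cases} (n-j-L)(\gamma_{j+L} - \gamma_{j+1+L}) & ;\ j \le L \\ (j-L)(\gamma_j - \gamma_{j+1}) + (n-j-L)(\gamma_{j+L} - \gamma_{j+1+L}) & ;\ L < j \le n-L \\ (j-L)(\gamma_j - \gamma_{j+1}) & ;\ n-L < j < n \\ (\gamma - L\gamma_{n+1}) + (n-L)(\gamma_n - \gamma_{n+1}) & ;\ j = n \\ n(\gamma_j - \gamma_{j+1}) & ;\ j > n \end{cases}$$

$$\therefore\ \frac{1}{n}\,\frac{\theta_j - \theta_{j+1}}{\gamma_j - \gamma_{j+1}} = \begin{cases} \left(1 - \frac{j+L}{n}\right)\frac{\gamma_{j+L} - \gamma_{j+1+L}}{\gamma_j - \gamma_{j+1}} & ;\ j \le L \\ \frac{j-L}{n} + \left(1 - \frac{j+L}{n}\right)\frac{\gamma_{j+L} - \gamma_{j+1+L}}{\gamma_j - \gamma_{j+1}} & ;\ L < j \le n-L \\ \frac{j-L}{n} & ;\ n-L < j < n \\ \frac{1}{n}\,\frac{\gamma - L\gamma_{n+1}}{\gamma_n - \gamma_{n+1}} + \left(1 - \frac{L}{n}\right) & ;\ j = n \\ 1 & ;\ n < j \le K \end{cases}$$

Let 

$$c = \min_{1 \le j \le n-L} \frac{\gamma_{j+L} - \gamma_{j+1+L}}{\gamma_j - \gamma_{j+1}}$$

then 

$$\frac{1}{n}\,\frac{\theta_j - \theta_{j+1}}{\gamma_j - \gamma_{j+1}} \ge \begin{cases} \left(1 - \frac{j+L}{n}\right)c & ;\ j \le L \\ \frac{j-L}{n} + \left(1 - \frac{j+L}{n}\right)c & ;\ L < j \le n-L \\ \frac{j-L}{n} & ;\ n-L < j < n \\ 1 - \frac{L}{n} & ;\ j = n \\ 1 & ;\ n < j \le K \end{cases}$$

$$\ge \begin{cases} \left(1 - \frac{2L}{n}\right)c & ;\ j \le L \\ \left(1 - \frac{2L}{n}\right)\min\{1, c\} & ;\ L < j \le n-L \\ 1 - \frac{2L}{n} & ;\ n-L < j < n \\ 1 - \frac{L}{n} & ;\ j = n \\ 1 & ;\ n < j \le K \end{cases}$$

$$\ge \left(1 - \frac{2L}{n}\right)\min\{1, c\} \quad ;\ 1 \le j \le K.$$

Therefore, for all $1 \le j \le K$, we have 

$$(\gamma_j - \gamma_{j+1}) - \frac{1}{n}(\theta_j - \theta_{j+1}) \le \left(1 - \min\{1, c\}\left(1 - \frac{2L}{n}\right)\right)(\gamma_j - \gamma_{j+1}).$$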

•'• -^0 — = 

Ef=i Ef=. It [(7, - 7,+i) - ^(^, - ^,+i)] qj^ivj^i 
< Ef=i Ef=s t [(7, - 7,+i) - ^(0, - Qj+iv,+i 



^ ^ (1 - c}(l - — )) (7j - -fj+i)qj+iVj+i 



<(l-min{l,c}(l-f))iir^"'^>, 

where Rq denotes the revenue of auctioneer from top / bidders in GSP 
^9^<(l-min{l,c}(l-^))^^ 



Ro 



< (l-mm{l,c}(l-f^)). 

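Proof of Theorem 10 

Using Lemma 9 we have 

$$E_0 - \frac{1}{n}E = \sum_{m=1}^{K} \gamma_m e_m v_m \left(1 - \frac{1}{n}\,\frac{y_m}{e_m v_m}\right).$$

Let us first assume that 

$$e_m v_m \ge e_{m+1} v_{m+1} \text{ for all } 1 \le m < n. \tag{25}$$

For $m \le L$, we have 

$$\frac{1}{n}\,\frac{y_m}{e_m v_m} = \frac{1}{n}\,\frac{\sum_{i=1}^{n} e_i v_i}{e_m v_m} \ge \frac{1}{n}\,\frac{\sum_{i=1}^{n} e_i v_i}{e_1 v_1} = \alpha, \quad \text{where } \alpha = \frac{1}{n}\,\frac{\sum_{i=1}^{n} e_i v_i}{e_1 v_1}.$$

For $L < m \le n$, 

$$\frac{1}{n}\,\frac{y_m}{e_m v_m} \ge 1 + \frac{\omega}{n}, \quad \text{where } \omega = \min_{L < m \le n}\left(\frac{e_{m-L} v_{m-L}}{e_m v_m} - 1\right).$$

For $n < m \le K$, 

$$\frac{1}{n}\,\frac{y_m}{e_m v_m} = 1.$$

Therefore, 

$$E_0 - \frac{1}{n}E = \sum_{m=1}^{L} \gamma_m e_m v_m\left(1 - \frac{1}{n}\,\frac{y_m}{e_m v_m}\right) + \sum_{m=L+1}^{n} \gamma_m e_m v_m\left(1 - \frac{1}{n}\,\frac{y_m}{e_m v_m}\right) + \sum_{m=n+1}^{K} \gamma_m e_m v_m\left(1 - \frac{1}{n}\,\frac{y_m}{e_m v_m}\right)$$

$$\le (1-\alpha)\sum_{m=1}^{L} \gamma_m e_m v_m - \frac{\omega}{n}\sum_{m=L+1}^{n} \gamma_m e_m v_m$$

$$= (1-\alpha)E_0^{L} - \frac{\omega}{n}E_0^{>L}$$

where $E_0^{L} = \sum_{m=1}^{L} \gamma_m e_m v_m$ and $E_0^{>L} = \sum_{m=L+1}^{n} \gamma_m e_m v_m$. 

But it might be the case that Equation 25 does not hold. In this case, we have for $1 \le m \le L$, 

$$\frac{1}{n}\,\frac{y_m}{e_m v_m} \ge \beta, \quad \text{where } \beta = \frac{1}{n}\,\frac{\sum_{i=1}^{n} e_i v_i}{\max_{1 \le m \le L} e_m v_m},$$

and for $L < m \le n$, 

$$1 - \frac{1}{n}\,\frac{y_m}{e_m v_m} \le \eta, \quad \text{where } \eta = \max_{L < m \le n}\left\{\max_{m-L \le i < m}\left(1 - \frac{e_i v_i}{e_m v_m}\right)\right\}.$$

$$\therefore\ E_0 - \frac{1}{n}E \le (1-\beta)E_0^{L} + \eta E_0^{>L}.$$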